I. Introduction

RADC is the focal point for reliability assessment and assurance in the Air
Force. Of the many functions that this responsibility entails, this TM concerns the
facilities and capabilities for microcircuit reliability characterization through
accelerated stress testing. Specifically, this report documents what has been
accomplished in the past seven years to enhance the capabilities of the Accelerated
Stress Facility (ASF) at RADC and gives insight into the direction stress testing
must take in the future. It is understood that RADC is committed to continuing to
provide an in-house facility capable of performing reliability characterizations of
microelectronics in support of Air Force and DOD programs. It is understood that
the facility is not a high-volume, routine test house, but rather a facility that
responds with reliability characterizations of high-cost, small-volume,
state-of-the-art or custom applications of technology, or studies the emerging
technology itself. It is also understood that if RADC is to continue to perform
reliability characterization, we must make use of automation in all phases of that
process, including the stress testing capability.


II.  From  Static  to  SMART 


The  goal  of  stress  testing  is  to  stimulate  potential  failure  mechanisms.  This  is 
often  done  at  stress  levels  different  than  those  expected  in  actual  application  to 
effect  an  acceleration  of  the  mechanism.  Typical  stressors  are  temperature, 
voltage,  and  humidity. 


Elevated  temperature  testing  is  the  most  commonly  used  stress.  Typical 
microcircuit  stressing  techniques  can  be  visualized  as  shown  in  Figure  1.  The  most 
elementary  is  the  storage  bake  where  elevated  temperature  is  used  as  the  sole 
stressor  for  some  period  of  time.  This  is  effective  for  stabilization  of  device 
parameters  and  evaluation  of  epoxies  and  polyimides  used  for  die  attach  and  chip 
coatings. 

Figure 1: Historical Stress Concepts


The  addition  of  static  voltages  provides  electric  fields  which  can  accelerate 
oxide breakdown, charge accumulation, corrosion, and material migration in
addition  to  the  temperature  effects.  Static  stress  is  the  most  widely  used  stress 
since  it  is  effective  for  accelerating  many  failure  mechanisms  and  can  be  applied 
with  relatively  simple  test  hardware. 

For larger, more complex devices with internal nodes that are not directly
accessible from the package pins, and for devices which are dynamic by design, such
as dynamic RAMs, dynamic stress is required to stimulate the potential failure
mechanisms. This technique seeks to exercise the nodes of the device to
accelerate mechanisms such as charge injection, leakage, and oxide breakdown
throughout the device. Purely static bias allows uncontrolled and unknown bias
conditions to exist on the internal device nodes, and it cannot produce
charge/discharge currents or switching transients. Dynamic bias is increasingly
recognized as essential, particularly as device geometries shrink and complexity
increases. It is, however, considerably more demanding in expense and test
hardware.

It was this need for dynamic capability, experienced during the in-house
MX-ACT I study, that motivated an upgrade of the RADC stress test facility. It was
also recognized that the required labor and material could be reduced for both
static and dynamic testing through better design.

Prior  to  1977  a  patch  board  arrangement  as  shown  in  Figure  2  was  in  use.  This 
approach  was  useful  for  static  bias  but  very  labor  intensive  to  set  up  and  not 
suitable  for  dynamic  testing. 

Beginning in 1977, in-house oven modifications were started to provide
dynamic exercise capability and to reduce the labor requirements. Two thermal
chamber doors were modified with six-inch feed-through boards connecting to card
racks on both sides. High temperature device and feed-through boards were designed
and fabricated with nickel-clad polyimide materials capable of 250°C operation.
Each through-door socket assembly was seventy pins wide. These sockets were
arranged in three columns of ten each. Capacity per door ranged from thirty 64-pin
devices to one hundred fifty 14-pin devices. Figure 3 shows the door
configuration. Devices under test (DUTs) are biased via drive boards on the
exterior of the chamber. As shown in Figure 4, the drivers, which are identical
copies of each other, are interconnected with mass-termination ribbon cable and fed
from a common pattern source. The pattern source in this figure is located in the
top slot of the left-hand column of boards. The regularity of the support circuitry
and interconnections greatly reduced the assembly and maintenance time. Wire-wrap
connections were made standard, being quick and flexible in the small-volume
environment of the ASF. The figure also illustrates the use of an ac-dc disturbance
analyzer, with internal clock and printer, monitoring the ac line voltage and the dc
power supply to the devices under test. This is useful in detecting and identifying
potentially damaging transients which may occur during testing. On top of the
chamber is a frequency counter which monitors the operation of an exercise signal.
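
The door capacities quoted above follow from the socket geometry. The short sketch below reproduces that arithmetic; the function name is illustrative, and it assumes that packages sit end to end in a socket assembly and never straddle two assemblies.

```python
# Illustrative arithmetic only; the names are hypothetical and not from the report.
SOCKET_PINS = 70        # each through-door socket assembly is seventy pins wide
SOCKETS_PER_DOOR = 30   # three columns of ten sockets

def devices_per_door(pins_per_device: int) -> int:
    """Devices per door, assuming packages cannot straddle two socket assemblies."""
    per_socket = SOCKET_PINS // pins_per_device
    return per_socket * SOCKETS_PER_DOOR

for pins in (14, 64):
    print(f"{pins}-pin devices per door: {devices_per_door(pins)}")
```

Five 14-pin packages fit in one seventy-pin assembly, while only one 64-pin package does, which gives the two limits stated in the text.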

A  universal  DUT  board  was  designed  to  accept  dual-in-line  packages  with 
widths  of  300,  400,  and  600  mils.  The  basic  board  is  shown  in  Figure  5  and  some 
applications  are  seen  in  Figure  6. 


Figure 3: Chamber Thru-Door Implementation


Figure 4: Thru-Door Application


Figure  5:  Universal  DUT  Board 


These  modifications  made  dynamic  stressing  up  to  1MHz  practical  and  greatly 
simplified  the  process  of  setting  up  the  stress  tests. 

These techniques (high temperature storage, static stress, and dynamic stress)
are continually used to study the potential failure modes and mechanisms in
semiconductors. The accelerated changes are not usually known until the DUTs are
removed  from  the  chamber  and  evaluated  on  Automated  Microcircuit  Test 
Equipment  (AMTE).  These  tests  often  include  parametric  measurement  of  leakage 
currents,  output  drive  capability,  and  functional  verification.  The  functional 
testing  on  AMTE  is  intended  to  determine  with  a  high  degree  of  accuracy  and 
precision  that  the  DUT  provides  the  output  that  is  expected,  when  it  is  expected, 
for  a  given  set  of  conditions  including  input  pattern,  voltage,  timing,  and 
temperature. 

Figure 6: Socketed Universal DUT Board


If  output  monitoring  circuitry  were  added  to  the  stress  test  hardware, 
functional  information  could  be  obtained  on  the  DUTs  in  the  test  chamber  during 
the  stress  test. 

This would open many possibilities for increasing the information gathered on
each device. First, a major problem with AMTE testing is the time and cost involved
in testing a device. Every device that comes out of the stress chamber, good or bad,
is tested to some degree on the AMTE. The AMTE tests only one or at most a few
devices at a time, generally in round-robin fashion. Test times beyond a few
seconds to a few minutes per device are not affordable. In conventional approaches
to testing, more complex devices require increasingly complex test algorithms. The
execution time for these tests quickly forces severe limits on the testing that can
be afforded.

Time is not so much of an issue in stress testing. Current burn-in conditions
such as those specified in MIL-STD-883 call for 160 hours at 125°C for Class B
devices. Life tests are usually conducted for 1000 hours, 4000 hours, or even
longer. The ability to functionally evaluate devices in the chamber would
therefore allow more extensive device evaluations to be performed. Data on the
whole population of devices under test could be gathered during the stress test and
made available for analysis. In the past, the effects of the stressors were not
known until the parts were tested one at a time on the AMTE at some predetermined
test interval. This was not because the information was unavailable; the data
merely was not being accessed. The data is available. The time is available. The
information needs to be gathered.

Secondly, much of the expense of the AMTE lies in the accuracy, precision, and
speed required to measure any given parameter. The actual value often has no
informational content. Were the system considerably less sophisticated, the
informational content would not be greatly changed.

The test hardware at the chamber can address this less demanding testing
area. The fact that clock rates and timing edges will be slowed because of physical
lengths, loading, etc., is not of overriding concern. The incorporation of test
capability at the chamber is not intended to do away with AMTE. Rather, testing at
the chamber can make the AMTE more productive. The chamber test capability can
bound operating regions and identify devices that are obviously good, obviously no
good, or in need of better capability to classify. For example, if a part fails to
function in the chamber, is it worth the expense of the AMTE? If an output appears
between 75 ns and 50 ns earlier than the specification, is it important to
establish that the actual time is 62.5 ns early? On the whole population of
devices? For characterization of a sample, yes. For failure analysis, perhaps. For
general product testing on the AMTE, for all measurements, for all parts, at all
temperatures, whether good or bad, no. Such a use is a waste. The chamber test
capability can be a screen to make the AMTE more efficient and productive.
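
The screening idea in the preceding paragraph can be sketched as a three-way sort. The example below is a minimal illustration, not the ASF implementation: the names `Verdict` and `screen`, the use of an output delay window, and the 200 ns limit are all assumptions made for the example. The chamber hardware is assumed to report only whether the part functions and a coarse window that bounds its output delay.

```python
from enum import Enum

class Verdict(Enum):
    GOOD = "good"
    NO_GOOD = "no good"
    NEEDS_AMTE = "needs AMTE"

def screen(functional: bool, delay_lo_ns: float, delay_hi_ns: float,
           spec_max_ns: float) -> Verdict:
    """Sort a device using only coarse chamber data (hypothetical rule set)."""
    if not functional:
        return Verdict.NO_GOOD        # a nonfunctional part needs no AMTE time
    if delay_hi_ns <= spec_max_ns:
        return Verdict.GOOD           # the whole bounded window meets the limit
    if delay_lo_ns > spec_max_ns:
        return Verdict.NO_GOOD        # the whole bounded window violates the limit
    return Verdict.NEEDS_AMTE         # the window straddles the limit

# An output arriving 75 ns to 50 ns early against an assumed 200 ns limit.
print(screen(True, 200 - 75, 200 - 50, 200))
```

Only parts whose coarse window straddles the limit are sent on for a precise measurement, which is the sense in which the chamber acts as a screen for the AMTE.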

An  extension  of  this  would  allow  the  AMTE  test  program  to  be  a  unique 
concatenation  of  only  those  tests  which  are  required  to  complete  the  database  of 
information  for  each  specific  device.  The  test  capability  at  the  chamber  fills  in 
the  less  demanding  and  more  time  consuming  data.  The  AMTE  fills  in  the 
demanding  difficult  data.  The  demand  on  the  AMTE  is  reduced.  The  total 
information  available  on  the  device  is  increased. 


There are other advantages to the incorporation of test capability at the
chamber. One of these has to do with knowing that the DUTs are in fact being
stressed as intended. Escapes are those devices which elude the stress or the
detection of the incidence of failure. The detection of correct device function
provides a means to ensure that the electrical bias is being applied to each DUT and
that the operating region of the DUTs is not being exceeded by the stressors.
Devices can be damaged by incomplete electrical bias caused by broken solder
joints, corroded socket contacts, etc., or at least be subjected to only a subset of
the intended stressors and so escape the screen to fail later in application.
Another class of escapes is devices which recover to within specified limits prior
to testing on AMTE. In either case, testing at the chamber offers the opportunity of
catching these devices.

Another important type of failure is the intermittent. Intermittents due to
thermal ramping of the chamber usually escape detection. MIL-STD-883 currently
calls out procedures for checking signal continuity. The method basically addresses
stabilized temperatures and represents a compromise between continuity assurance
and the practical problems of acquiring that assurance. Testing at the chamber will
provide the assurance of continuity.

Testing at the chamber also offers the possibility of detecting another
intermittent fault, namely soft errors. Soft error detection requires many device
hours to be accumulated. The testing time and the number of devices under test make
this possible in the stress chamber.

One of the major definitions of failure is nonfunctionality. Another has to do
with stability. The ability to test at the chamber will provide the ability to
monitor device stability throughout the stress test, not just at the end points.
Monitoring throughout the stress test will make it possible to better characterize
the fallout during the test, through better statistics and better correlation with
failure mechanisms. Such insight, together with corrective action, will lead to
improved yield.

As package dimensions increase, the thermal mass increases and consequently so
does the time required to achieve thermal equilibrium with the ambient. In the
chamber all parts are exposed with bias to the thermal environment. Testing can
therefore be performed during the thermal transitions, and also with confidence
that the devices have reached thermal equilibrium.

Table  1  summarizes  some  of  the  advantages  that  testing  at  the  chamber  makes 
possible. 

TABLE  1:  ADVANTAGES  OF  CHAMBER  TEST 

Extensive  device  evaluation 
Data  during  the  stress  test 
Reduced  AMTE  demand 
Assurance  of  stress  test  integrity 
Hard/soft  error  detection 
Improved  statistical  data 
Thermal  stability 


The design and conduct of reliability stress tests have been labor intensive. The
advent of dynamic testing represents an increase in complexity, and monitored
dynamic testing another major increase. More complex devices necessitate
increased testing, which further compounds the problem. There are more device
pins and more device functions. Testing at the chamber is an answer. In order to
implement testing at the chamber, the stress test must be automated.

It should be recognized that the complex circuits requiring testing are
themselves intended to automate solutions: to reduce labor, to perform functions
better, and to make new functions practical. By making use of these new complex
devices, such as microprocessors, microcomputers, and memory devices, the testing
process can be profitably automated. This not only preserves but expands the
ability to assess device reliability while reducing the manpower and increasing the
throughput. The integration of the computer will bring the testing problem back
down to size and provide many significant additional tools for performing
reliability characterizations.

What is needed is automated monitored reliability stress testing (AMRST). It
is simply the integration of three elements: (1) electronics capable of
controlling the stressors applied to the devices under test, (2) electronics
capable of monitoring the response of the devices under test, and (3) computer
technology to intelligently interface these elements.

The key is the integration of computer technology. The test definition is then
in software, not hardware. To change the test definition one changes the software,
which is quicker and easier than wiring a new drive board. Stress testing involves
waiting for something to change. Recognizing that change at the chamber
involves gathering large amounts of data. Computers are excellent for repetitive
programs and for handling and sorting data. Many of the time-consuming,
error-prone details of actually conducting a stress test can be accomplished far
better by the computer. One of the most advantageous features, however, is that
computers can make the stress test adaptive. Interfacing the controlling and
monitoring capabilities via software will make possible an adaptive system whereby
the stressors can be tailored to the response of the devices under test. Thus,
stress tests could be customized to each particular lot of devices under test.
"Burn-in to order", to a particular failure rate or failure-free period, will be
possible. Bad lots of product can be identified early to allow replacement with
product that can be profitably tested. The stress test can be made to fit the parts
and the target reliability rather than a generalized model.

Incorporating the possibilities of AMRST into the capabilities of the ASF at
RADC has been the major emphasis of the past five years. Various
experiments  have  been  performed  to  recognize  and  develop  some  of  those 
possibilities.  What  was  soon  recognized  was  that  the  integration  of  the  computer 
into  stress  testing  is  only  the  beginning.  The  opportunities  that  present  themselves 
for  optimized  testing,  enhanced  data  analysis,  and  more  efficient  and  effective 
reliability  testing  are  as  tremendous  as  they  are  necessary.  To  put  it  another  way, 
AMRST  is  really  "SMART"  testing.  The  next  section  deals  with  in-house 
experiments  and  results. 


III.  In-house  SMART  Work 


Devices  of  VLSI  complexity,  capability,  size,  and  "production"  volume  coupled 
with  the  increased  emphasis  on  custom  devices,  short  device  development  schedule, 
and  increased  labor  costs  make  reliability  characterization  more  difficult. 
Junction  density,  current  density,  electromagnetic  coupling,  thermal  coupling, 
latchup,  and  smaller  processing  margins  make  reliability  assessment  more 
necessary.  To  get  the  needed  information  new  methods  have  to  be  devised.  New 
techniques  for  obtaining  the  necessary  information  from  less  data  in  less  time  must 
be  developed.  New  methods  of  handling  all  this  information  must  be  utilized. 

The  key  to  these  needs  is  automation  and  in  particular  the  integration  of 
computers. 

The initial in-house application of computers in the stress test process took
place in 1979. This involved a microprocessor evaluation kit based on the CMOS
1802 microprocessor. The evaluation kit consisted of the processor, 0.5K RAM,
some parallel I/O, an RS232C interface, and a 1K ROM with minimal utilities and
monitor programs. All programming was done in machine language. The CMOS
microprocessor system offered a wide single operating voltage range (3-15V), which
eased the interface requirements to devices under test, clock rates from DC to
5MHz, and single-step operation, and it was a fast-emerging, promising technology.
The evaluation kit also offered LED monitor lights on the data (8) and address (16)
busses, control lines, state lines, and serial output line for debug purposes, user
work space, and a capability to expand the memory on the single pc board to 4K. The
cost was $250.


Figure  7:  1802  Evaluation  Board 

Figure 7 is a photo of the evaluation board with an auxiliary keyboard
terminal.

The  first  automated  monitored  stress  testing  experiment  was  performed  in 
1980  on  some  developmental  MNOS  memories  configured  32  x  8.  The  objective  was 
to  study  the  retention  of  the  MNOS  memory  as  a  function  of  elevated  temperature, 
number  of  read  cycles,  and  duration  of  the  write  cycle.  The  test  was  concerned 
with  stimulating  the  loss  of  stored  charge  which  would  evidence  itself  by  incorrect 
data. 

The test setup is shown in Figures 8 and 9. The evaluation kit was
programmed to generate the necessary control logic waveforms, addresses, and
expected data. These signals were supplied to the interface board located in the
bottom slot of the right-hand column in Figure 9. A block diagram of the hardware
is shown in Figure 10. Referring to Figure 9, the signals were buffered and drove
the devices under test in parallel via the left-hand ribbon cable. The DUT outputs
were compared with the expected data by XNOR and AND circuitry on each driver
card. The resultant signal for each memory was cabled back to the interface
board via the right-hand ribbon cable, where the results were combined by another
AND circuit to generate a fail signal. Upon detection of this fail signal the uP
would halt the generation of the exercising signal and input the individual failed
device signal as well as the address of the fault. The uP would then process this
information to determine whether this was a first-time failure for the particular
device and location or a repeat occurrence. The results were stored in either a
temporary or a permanent failure array. Upon completion of the analysis, testing
was resumed.

Figure 10: MNOS Automated Monitored Stress System Block Diagram

Once a device address was determined to be a repeated failure, the uP would
issue an error mask bit to the interface board, inhibiting subsequent fault
detections for that specific device and address.
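
The compare, classify, and mask sequence described above can be sketched in software. This is a minimal model, assuming a simplified behavior: the class and function names are hypothetical, the XNOR/AND comparison that the report places in hardware is modeled as a bitwise operation, and a repeat failure is assumed to be copied to the permanent array and masked in the same step.

```python
WORDS = 32  # the memories were organized 32 x 8

def outputs_match(dut_byte: int, expected_byte: int) -> bool:
    """Bitwise XNOR of the two bytes, then an AND across all eight result bits."""
    xnor = ~(dut_byte ^ expected_byte) & 0xFF
    return xnor == 0xFF

class FailureLog:
    """Hypothetical stand-in for the temporary, permanent, and error mask arrays."""
    def __init__(self) -> None:
        self.temporary = set()   # (device, address) pairs seen failing once
        self.permanent = set()   # pairs seen failing again
        self.masked = set()      # pairs whose fault detection is inhibited

    def record(self, device: int, address: int) -> str:
        key = (device, address)
        if key in self.masked:
            return "masked"
        if key in self.temporary:
            self.permanent.add(key)
            self.masked.add(key)     # assumed: mask as soon as the repeat is seen
            return "permanent"
        self.temporary.add(key)
        return "temporary"

def read_pass(devices, expected, log) -> None:
    """One complete read of every address on every device."""
    for address in range(WORDS):
        for device, memory in enumerate(devices):
            if not outputs_match(memory[address], expected[address]):
                log.record(device, address)

expected = [0xA5] * WORDS
devices = [list(expected) for _ in range(16)]   # sixteen devices under test
devices[3][5] ^= 0x01                           # inject a stuck bit for illustration
log = FailureLog()
for _ in range(3):
    read_pass(devices, expected, log)
print(sorted(log.permanent))
```

The injected fault is logged as temporary on the first pass, promoted to permanent and masked on the second, and ignored on the third.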

Sixteen  devices  were  thus  stressed.  Periodically  testing  would  be  halted  to 
read  the  temporary  and  permanent  failure  arrays. 


Utilizing a 4MHz crystal, the cycle time for the software generated signals was
64 uS, which corresponds to a frequency of less than 16KHz. While this is
relatively slow, it resulted in 15625 read cycles per second, or 488 reads of the
complete memory per second. Per day, the complete memory was read over 42 million
times. In the course of a 168 hour (7 day) burn-in, all memory addresses would be
read in excess of 295 million times. For a 1000 hour life test, the number of
complete reads approaches 1.8 billion. These cycles were performed on all sixteen
devices under test. These numbers are summarized in Table 2.

TABLE 2: MNOS READ CYCLE COUNTS

TIME            READ CYCLES     COMPLETE MEMORY READ CYCLES
SEC             15625           488
MIN             937500          29297
HR              5.62 x 10^7     1.76 x 10^6
DAY             1.35 x 10^9     4.22 x 10^7
WK (168 Hrs.)   9.45 x 10^9     2.95 x 10^8
1000 HR         5.62 x 10^10    1.76 x 10^9

64 uS Cycle Time
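
The entries in Table 2 follow directly from the 64 uS cycle time and the 32 word memory. The sketch below recomputes them; the variable names are illustrative only.

```python
CYCLE_TIME_US = 64   # software generated read cycle time, from the text
WORDS = 32           # addresses per memory (32 x 8 organization)

reads_per_second = 1_000_000 / CYCLE_TIME_US     # individual read cycles
complete_per_second = reads_per_second / WORDS   # reads of the whole memory

intervals_s = {
    "SEC": 1,
    "MIN": 60,
    "HR": 3600,
    "DAY": 24 * 3600,
    "WK (168 Hrs.)": 168 * 3600,
    "1000 HR": 1000 * 3600,
}

# Each row holds (read cycles, complete memory read cycles) for one interval.
table = {name: (reads_per_second * s, complete_per_second * s)
         for name, s in intervals_s.items()}

for name, (reads, complete) in table.items():
    print(f"{name:14s} {reads:12.3e} {complete:12.3e}")
```

The 1000 hour row works out to about 1.76 x 10^9 complete reads of each memory.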

The address sequence in this MNOS experiment was sequential. It need
not have been. The program utilized an array to store the address sequence. A
different stored sequence, such as all odd addresses followed by all evens, or
random, or some other generated sequence, could just as well have been used. The
changes could have been quickly effected since the pattern source and sequence
were software controlled. Likewise, the designation of the first occurrence of
failure as transient and a repeat occurrence as permanent could quickly have been
changed, for example to record every occurrence or to mask the error only on the
fifth occurrence. This type of flexibility is essential to stress testing complex
parts and is easily supported with a software programmable system including a
software programmable pattern source.
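
Because the address sequence was held in an array, swapping one sequence for another amounts to refilling that array. The following sketch generates the three alternatives mentioned above; the function names and the seeded random generator are assumptions for illustration, not a reconstruction of the original machine language program.

```python
import random

WORDS = 32  # addresses in the 32 x 8 memory

def sequential() -> list[int]:
    """The order used in the experiment: 0, 1, 2, ..., 31."""
    return list(range(WORDS))

def odds_then_evens() -> list[int]:
    """All odd addresses followed by all even addresses."""
    return list(range(1, WORDS, 2)) + list(range(0, WORDS, 2))

def shuffled(seed: int = 0) -> list[int]:
    """A repeatable pseudo-random order that visits every address once."""
    rng = random.Random(seed)
    order = list(range(WORDS))
    rng.shuffle(order)
    return order

for make_sequence in (sequential, odds_then_evens, shuffled):
    order = make_sequence()
    print(make_sequence.__name__, order[:6])
```

Each generator still visits all 32 addresses exactly once per pass, so the read counts per complete memory read are unchanged.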

The  total  software  used  in  the  stress  test  of  these  memories  (exclusive  of  the 
ROM  base  monitor  routines)  required  only  380  bytes  of  memory.  This  included  an 
address  array,  reference  data  array,  error  mask  array,  temporary  and  repeat 
failure  arrays,  and  coding  to  perform  the  control  logic  signal  generation  and  data 
analysis  to  identify  the  failures. 

Table 3 lists some of the recognized advantages and Table 4 the disadvantages
of this particular test. The basic conclusion can be drawn that for some stress
testing applications such a simple, inexpensive setup is practical and offers many
advantages to the test engineer.


TABLE 3: MNOS/EVALUATION KIT ADVANTAGES

Functional monitoring of every DUT
Software programmable exercise pattern
Software programmable data acquisition and analysis
Fault isolation to device and address
Temporary or repeat fault determination
Error mask for repeat faults
Boolean operations supported by uP
Simple hardware, simple debug
Single board uP system
Effective
Cheap

TABLE 4: MNOS/EVALUATION KIT DISADVANTAGES

Machine language programming
Low clock rates
Controlled only DUT exercise signals

While  the  evaluation  kit  proved  successful  for  the  MNOS  program  it  was  also 
clear  that  for  other  test  situations  more  computer  technology  would  be  required. 
The  next  application  of  computers  involved  incorporating  a  minicomputer  into  a 
stress  test.  The  chosen  unit  was  specifically  designed  for  measurement  and  control 
applications  in  an  industrial  or  laboratory  environment. 

Some of the considerations in utilizing this particular minicomputer in the
stress test are summarized in Table 5.

TABLE 5: MINICOMPUTER ADVANTAGES

Basic design for measurement and control applications
Card cage approach with uncommitted bus structure (16 slots)
Digital and analog input and output
32K RAM expandable
Magnetic tape cartridge for programs and storage
High order language (BASIC)
Real time clock
Multiple RS232 compatible ports
Thermocouple capability
IEEE 488 interface capability
Auto restart after power loss
Multitask operating system
Self contained terminal and display

A key feature of instrumenting the test process has to be flexibility. This
involves the type and amount of hardware needed to support the testing. A card
cage with a common bus, capable of being configured for the test-specific number
of analog and digital channels, is desirable to allow easy tailoring of the system
to the particular needs of the test. In this case, the computer cabinet housed all
necessary system power supplies, computing logic, and a sixteen-slot card cage.
The card cage could be filled with any mix of digital in, digital out, A/D
multiplexer, D/A converter cards, or other specialized interfaces, with the
exception of one timer card which had a particular slot requirement.

The primary application was to monitor and control an EOS/ESD experiment
on 4011B CMOS microcircuits to study time-dependent latent defects. The test
setup is shown in Figure 11. The devices under test were statically biased. Each
input to each device and the power to each device were controlled by a drive board
which was under the control of the computer. The primary function of the
computer was to control the drive boards, monitor the leakage currents in both VSS
and VDD of each device, and analyze and store the data. Data arrays were
constructed of part numbers, input vectors, measurements, and times of detection
for parts exceeding prescribed deltas. The arrays were stored to the tape cartridge
either when full or hourly. In this way data was maintained on the devices with no
more than a one hour lapse. The device boards were equipped with voltage
references to calibrate the A/D measurement for the circuit-induced offsets.
Comparison with measurements made on an automated microcircuit test system
showed that measurements of 300 nA or more were reasonably accurate, while
multiple measurements even below 150 nA could be recognized as representing trends.
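
The logging policy described above (record a part when its leakage moves beyond a prescribed delta, and store the array when it fills or once an hour) can be sketched as follows. This is a hedged illustration written in Python rather than the BASIC the minicomputer ran; the class names, the 100 nA delta, and the list standing in for the tape cartridge are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class Event:
    part: int          # part number
    vector: int        # input vector applied when the reading was taken
    current_na: float  # measured leakage current in nA
    hour: float        # elapsed test time at detection

def exceeds_delta(baseline_na: float, reading_na: float, limit_na: float) -> bool:
    """True when a reading has moved more than the prescribed delta from baseline."""
    return abs(reading_na - baseline_na) > limit_na

class EventBuffer:
    """Holds events in memory and stores them when full or after the interval."""
    def __init__(self, capacity: int, flush_interval_h: float = 1.0) -> None:
        self.capacity = capacity
        self.flush_interval_h = flush_interval_h
        self.pending: list[Event] = []
        self.stored: list[Event] = []   # stands in for the tape cartridge
        self.last_flush_h = 0.0

    def add(self, event: Event) -> None:
        self.pending.append(event)
        if len(self.pending) >= self.capacity:
            self.flush(event.hour)

    def tick(self, now_h: float) -> None:
        """Call periodically so data is never held longer than the interval."""
        if now_h - self.last_flush_h >= self.flush_interval_h:
            self.flush(now_h)

    def flush(self, now_h: float) -> None:
        self.stored.extend(self.pending)
        self.pending.clear()
        self.last_flush_h = now_h

buffer = EventBuffer(capacity=2)
if exceeds_delta(baseline_na=50.0, reading_na=320.0, limit_na=100.0):
    buffer.add(Event(part=7, vector=0b00000011, current_na=320.0, hour=0.1))
buffer.add(Event(part=9, vector=0b00000001, current_na=410.0, hour=0.2))
print(len(buffer.stored), len(buffer.pending))
```

With a capacity of two, the second event fills the array and both are written out, so at most one interval of data is ever at risk.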

Figure 11: EOS/ESD Stress Test Hardware

The experiment had three test cells: two for elevated temperatures and one
for room temperature testing. The drive circuitry for each cell, illustrated in
Figure 12, supported 60 devices under test and controlled a total of 480 inputs and
60 power lines, all independently, along with 120 separate differential
measurement points for VDD and VSS. The digital control required 16 digital
output channels per cell, 2 analog input channels, and 16 digital input channels
for bus integrity. Each drive board is comprised of 200 active devices with 480
connections  and  about  900  feet  of  #30  wire.  Each  connection  to  each  device  under 
test  consists  of  three  additional  connectors, and  five  foot  of  cabling  to  finally  place 
the  signal  at  the  device  socket.  While  the  regularity  of  the  circuitry  is  high  the 
debug  and  trouble  shooting  problems  were  involved.  However,  the  use  of  the 
computer  and  a  test  jig  which  plugged  into  the  DUT  socket  identified  missing, 
broken,  or  misplaced  wires,  bent  pins,  and  open  or  shorted  connectors  and  cables 
allowing  rapid  fault  detection.  Furthermore  the  full  evaluation  could  be  made  each 
time  the  socket  was  emptied  for  a  device  measurement  and  therefore  better 
control  over  the  hardware  was  maintained.  The  test  jig,  shown  in  Figure  13,  is 
composed  of  a  14  pin  dip  extension  cable  which  was  plugged  into  the  device  under 
test  socket  linking  the  signals  supplied  to  the  DUT  socket  by  the  drive  board  under 
the  control  of  the  digital  output  of  the  computer  back  to  the  digital  input  of  the 
computer.  This  provided  a  go/ no  go  test  for  all  8  inputs,  the  presence  of  power  in 
the  line,  and  a  check  on  the  existance  of  a  connection.  By  controlling  the 
output  of  the  drive  board  and  comparing  the  received  signals  to  the  expected 
response  the  various  faults  were  detected. 
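
The go/no-go comparison performed with the test jig can be sketched as follows.
This is a minimal illustration in Python, not the program used in the experiment.
The names loopback_faults, write_outputs, read_inputs, and SimulatedSocket are
hypothetical, and the walking-one stimulus is an assumed choice of pattern.

```python
def loopback_faults(write_outputs, read_inputs, n_lines=8):
    """Drive each line high in turn and compare the looped-back word.

    write_outputs(word) and read_inputs() are hypothetical hardware hooks;
    word is a tuple of n_lines bits. Returns (expected, received) pairs
    for every pattern whose readback differed.
    """
    patterns = [tuple(0 for _ in range(n_lines))]  # all lines low
    for i in range(n_lines):  # walking one
        patterns.append(tuple(1 if j == i else 0 for j in range(n_lines)))
    faults = []
    for expected in patterns:
        write_outputs(expected)
        received = tuple(read_inputs())
        if received != expected:
            faults.append((expected, received))
    return faults


class SimulatedSocket:
    """Stand-in for drive board plus jig; lines in open_lines read back 0."""

    def __init__(self, open_lines=()):
        self.open_lines = set(open_lines)
        self.word = ()

    def write(self, word):
        self.word = tuple(word)

    def read(self):
        return tuple(0 if i in self.open_lines else bit
                     for i, bit in enumerate(self.word))
```

An empty fault list corresponds to "go"; any entry identifies the driven pattern
and the word actually received, which is the information needed to locate a
broken or misplaced wire.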

A major concern in adding a large amount of support hardware to a testbed is the
corresponding increase in support hardware failures. Any incorporation of
automation must therefore stress the automation of the diagnostics. The computer
can assist in failure recognition, location, and resolution both during the
maintenance phases and during the actual stress test. This is not a trivial
advantage. Because real time information on the DUTs was available in this test,
it was possible to recognize, locate, and repair support hardware failures during
the test, restoring the system to the intended stress conditions. The computer can
also provide information to assess the possible effects of the induced stresses,
and their duration, that the DUTs experienced as a result of the failure. Without
monitoring, the loss of signals can go undetected, affecting up to six months'
worth of product even if MIL-STD-883 continuity provisions are adhered to. Even
the minimal automated diagnostics applied in this program made a significant
contribution to the successful test.

Other  diagnostics  consisted  of  verifying  the  16  bit  wide  digital  control  bus 
which  daisy  chained  through  the  drive  board.  This  check  could  be  done  without 
affecting  the  status  of  the  drive  board  and  verified  continuity  of  the  control  bus. 
Analog  diagnostics  were  also  generated  to  calibrate  the  analog  circuitry  utilizing 
the  on  board  reference  channels  (eight  per  drive  board),  and  using  known  loads  in 
the  DUT  sockets. 

The  use  of  the  computer  facilitated  the  documentation  of  the  hardware  status. 
Hard  copy  reports  indicated  the  diagnostics  findings  and  allowed  corrective  action 
to  be  annotated  thus  providing  a  record  of  how  the  fault  manifested  itself,  where 
the  fault  was  located,  and  the  occurrence  of  a  repair.  Figure  14  is  a  typical 
diagnostic  report  generated  with  the  socket  test  jig  for  the  digital  logic.  Figure  15 
illustrates  an  analog  diagnostic  report  detailing  offsets  in  the  analog  circuitry. 
Figure  16  illustrates  an  analog  calibration  with  known  loads  in  Bank  0  Devices  5,  6, 
and  7,  and  Bank  1  Devices  0  and  1  locations. 

EOS/ESD EXPERIMENT
10/05/82  15:05:31

BOARD NO. = 2   TEMPERATURE = 22 C   TAPE A FILE 2
DATE 10/5   START TIME 14:57:17   STOP TIME 15:2:47
START BANK 0   STOP BANK 7

Figure 16: Typical Analog Calibration Report

The real time knowledge of the testbed made possible the detection of various
fluctuations otherwise not observable and also provided the means of tracking and
correlating various events. In one situation, leakage excursions were occasionally
noted, many of which exceeded the delta limit. Correlations with power supply
drift, line voltage transients, oven temperature variations, and computer
temperature variations were examined, but none was found. The culprit turned out
to be spurious radiation from the high voltage horizontal sweep in a CRT terminal
associated with another computerized testbed. When the terminal was turned on to
monitor progress of the other experiment, interference would produce higher than
normal current measurements. When the terminal was turned off, the measurements
returned to their previous values. There were no residual effects.

Suppose that there was a permanent effect induced in the components. The
automated monitored test would detect that and assist in the source identification
and correction. If the testbed were not monitored, however, the permanent damage
would have been induced but not detected until the parts were subjected to AMTE
testing at the end of the stress period, perhaps 1000 hours later. It would not be
known that a problem unrelated to the quality of the parts had occurred, and it
would be very difficult to determine what did happen for that test. The tester
would indicate damage. Did that damage occur during the controlled stresses, on
the tester itself, or in the handling of the device between the test socket and the
AMTE? Is the damage test induced or a problem of the device population being
tested? Data gathered while the part is in the chamber would identify whether the
damage occurred during the controlled stressing or subsequent to that time and
would indicate the time distribution of device failures.

It is important to know if the test procedures are inducing faults so that the
lab procedures and personnel can be corrected, not the device manufacturer. Given
the cost, complexity, and quantity of the devices we test, we cannot afford
specious failures.

The computer made it possible to use more complex algorithms in testing the
parts. Oxide breakdown was expected to be a primary fault; gradually increasing
leakage due to charged particle accumulation, such as ionic drift, was the other
main expected failure mechanism. The device was a quad, two input NAND. Initial
input patterns were chosen to bias one gate with both inputs at one, one gate with
both inputs at zero, one gate with a zero and one pattern, and the remaining gate
with a one and zero pattern. The pattern was rotated from device to device so that
the same gate position on any 4 consecutive devices received a different pattern.
Thus, the initial input word for every fourth device was the same; sequential
devices were different (see Figure 17).

Figure 17: Initial Input Vector Rotation
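
The rotation of the initial input word can be sketched in a few lines of Python.
This is only an illustration of the scheme as described in the text: the function
name initial_vector, the ordering of the four gate patterns, and the bit order
within the 8 bit word are assumptions, not details taken from Figure 17.

```python
# Per-gate input pairs named in the text: both ones, both zeros,
# zero-one, and one-zero. The ordering here is an assumption.
GATE_PATTERNS = [(1, 1), (0, 0), (0, 1), (1, 0)]


def initial_vector(device_index):
    """Return the 8 bit initial input word for a device position.

    The four gate patterns are rotated by one gate per device, so the
    word repeats every fourth device.
    """
    shift = device_index % 4
    rotated = GATE_PATTERNS[shift:] + GATE_PATTERNS[:shift]
    return tuple(bit for gate in rotated for bit in gate)
```

With this rotation each gate position sees all four input conditions across any
four consecutive sockets, which spreads the bias conditions evenly over the lot.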

The computer maintained the previous acceptable current readings and the
input vectors. On a polling basis the IDD and ISS currents of each device were
measured. If the computer determined that a current reading was in excess of the
delta limit, it would store the information and update the reference. If the
reading exceeded the failure definition, the computer would use an algorithm to
search for an input condition which would produce a leakage within the limit,
allowing stressing to continue. The algorithm changed each of the eight inputs
independently. All possible input conditions for each gate were thereby checked
with just thirteen vectors. With the algorithm used, a single stuck-at fault could
always be resolved. Multiple faults were sometimes resolvable. Should none of the
thirteen input vectors give leakage currents below the failure limit, the device
inputs were programmed to ground and the power to that particular device was shut
off. The appropriate information was stored. The device could be repowered at
some future time and rechecked. In this way the part could be baked unbiased. If
the fault was due to ionic drift, the absence of an electric field in the presence
of the elevated temperature should result in baking out the problem (a failure
analysis technique). If the fault did not clear with the unbiased bake, then the
indication would be an oxide short.
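
The polling and search logic just described might look roughly like the sketch
below. It is written in Python for illustration and makes several assumptions:
measure_na is a hypothetical hook returning a leakage reading in nA, the delta
limit is taken as a change relative to the stored reference, and the candidate set
is simply every vector differing from the current one in a single input. It does
not reproduce the thirteen-vector sequence, which the text does not list.

```python
GROUNDED = (0,) * 8  # all eight inputs programmed to ground


def single_input_changes(vector):
    """Yield every vector that differs from vector in exactly one input."""
    for i in range(len(vector)):
        changed = list(vector)
        changed[i] ^= 1
        yield tuple(changed)


def poll_device(vector, reference_na, measure_na, delta_na, failure_na):
    """One polling step for one device.

    Returns (vector, reference_na, powered, events), where events is the
    list of records the computer would store for this step.
    """
    events = []
    reading = measure_na(vector)
    if reading <= failure_na:
        if abs(reading - reference_na) > delta_na:
            # Delta limit exceeded: record it and update the reference.
            events.append(("delta", vector, reading))
            reference_na = reading
        return vector, reference_na, True, events
    # Failure definition exceeded: look for an input condition within limits.
    events.append(("failure", vector, reading))
    for candidate in single_input_changes(vector):
        trial = measure_na(candidate)
        if trial <= failure_na:
            events.append(("revectored", candidate, trial))
            return candidate, trial, True, events
    # No acceptable vector: ground the inputs and remove power.
    events.append(("powered_off", GROUNDED, None))
    return GROUNDED, reference_na, False, events
```

A device returned with powered set to False is the case the text describes as
being baked unbiased; it can be repowered and polled again later.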

Particularly for CMOS, the leakage currents are the key parameters indicating
failure. By measuring the IDD and ISS leakage parameters, information could also
be gained on the location of the fault. Equivalent currents measured for both IDD
and ISS indicated that either both the upper and lower input protective networks
were damaged or that the input was an open circuit, causing the input node to float
to an intermediate value biasing both n and p transistors. The size and constancy
of the currents would be helpful in distinguishing these conditions.

If the IDD and ISS measurements were different, input damage was suspected.
If IDD were greater than ISS and that condition existed when a given input was low
but not high, the faulty input would be identified and the lower input protective
network or the N channel oxide would be suspect. Hence, not only is the fault
detectable, but the computer is able to do some preliminary failure analysis to
localize the fault.
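
The preliminary localization rules above can be expressed as a short Python
sketch. The function names, the layout of the readings dictionary, and the
tolerance used to decide whether two currents are "equivalent" are all
assumptions introduced for illustration; the text gives no numerical criterion.

```python
def compare_currents(idd_na, iss_na, tolerance_na):
    """Classify one IDD/ISS pair as 'equivalent' or 'different'."""
    if abs(idd_na - iss_na) <= tolerance_na:
        return "equivalent"
    return "different"


def suspect_low_side_inputs(readings, tolerance_na):
    """Return inputs where IDD exceeds ISS with the input low but not high.

    readings maps input index -> {0: (idd_na, iss_na), 1: (idd_na, iss_na)},
    the current pair measured with that input driven low (0) and high (1).
    Per the text, such an input points at the lower input protective
    network or the N channel oxide.
    """
    suspects = []
    for index, by_level in sorted(readings.items()):
        excess_low = by_level[0][0] - by_level[0][1] > tolerance_na
        excess_high = by_level[1][0] - by_level[1][1] > tolerance_na
        if excess_low and not excess_high:
            suspects.append(index)
    return suspects
```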

In the EOS/ESD program, should a current exceed the software programmable
failure criterion, the computer would search for a different input vector that would
allow the test to continue. The parts could then stay in the chamber unhandled
until the duration of the particular stress has elapsed, a failure rate has been
achieved, a failure free period has elapsed, or sufficient aging has occurred to
warrant AMTE testing. Burn-in, life test, or accelerated stress "to order,"
accomplished on an individual lot by lot basis, is possible. Should the particular
lot being tested have excessive fallout, the test can be terminated as soon as that
is recognized. If a given lot should take some percentage of time longer to achieve
the desired failure rate, the parts could remain under the stress conditions until
the target is achieved, eliminating the time and handling involved in sequentially
assessing that situation on the AMTE. Optimized stressing will mean optimized
usage of the stress hardware (ovens, boards, etc.), manpower, and AMTE while
providing higher quality and confidence with reduced total costs.
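
A lot by lot stopping rule of the kind described might be sketched as below. This
Python fragment is illustrative only: the function name, the parameter names, and
the use of the fraction of the lot failed as a stand-in for "a failure rate has
been achieved" are assumptions, and the order in which the criteria are checked is
arbitrary.

```python
def stop_reason(hours, failures, lot_size, hours_since_last_failure,
                max_hours, target_fraction_failed, failure_free_hours):
    """Return why stressing of a lot should stop, or None to continue."""
    if failures / lot_size >= target_fraction_failed:
        return "target failure fraction reached"
    if hours_since_last_failure >= failure_free_hours:
        return "failure free period elapsed"
    if hours >= max_hours:
        return "stress duration elapsed"
    return None
```

Evaluating such a rule at every polling interval is what lets a lot with
excessive fallout be pulled early and a slow lot remain in the chamber without
intermediate trips to the AMTE.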

As part of the EOS/ESD test, half of the devices were periodically removed
from the test cells and measured on the AMTE. Because the stress test electrical
conditions in this case were automated, the start up and stop procedures could be
the same each time. Test conditions, sequences, and duration can be controlled
such that comparison tests between lots or between vendors of "equivalent" parts
may be more accurately performed. Power supply sequencing and voltages would be
the same. Chamber temperature and ramp rates would be the same. Bringing
these and the many other test conditions, such as exercise patterns, under
control keeps these variables from affecting the test outcome.

Table  6  lists  some  of  the  recognized  advantages  of  using  the  minicomputer  in 
the  EOS  test  program  and  Table  7  the  disadvantages. 

TABLE  6:  EOS/MINICOMPUTER  ADVANTAGES 


High  Level  Language  Programming 
Variety/Flexibility  of  Hardware  Configuration 
Automated  Diagnostics 
Drive Electronics Failure Detection
Automated  Data  Analysis 
Automated  Input  Stress  Adjustment 
Increased  Data  and  Program  Storage 
Frequent  Repeatable  Data  Acquisition 
Structured  Power  Up/Down 
Reduced  Handling  of  Devices 

TABLE  7:  EOS/MINICOMPUTER  DISADVANTAGES 

Limited  Data  Storage 
No  Simultaneous  User  Capability 

A third phase of the smart stress system development has been accomplished at
RADC. The evaluation kit was found to be limited in usefulness, though the
hardware is very inexpensive. The lab computer is quite flexible, but even
though it has more storage capacity it still had limited data storage capability
and no ability to support more than one program at a time, and it was more
expensive than necessary.

An  alternative  approach  has  been  to  develop  a  distributed  processing  intralab 
network.  The  network  consists  of  a  central  computer  managing  shared  resources 
and  directing  the  stress  tests  via  control  of  small  satellite  computers  at  each  stress 
chamber.  The  central  controller  is  also  used  for  data  analysis  and  program 
development.  The  satellite  test  systems  are  concerned  with  the  localized  control 
of the test such as actual exercise of the DUTs and data acquisition. The central
machine  maintains  the  mass  storage,  printer  and  other  output  devices,  and 
overhead  functions  allowing  data  analysis  to  be  done  without  interfering  with  the 
testing. 

Because the nontest related resources are available via the network, the
required capabilities of the satellites are reduced. The smaller satellite test
systems are configured around the commercially well supported STD BUS. The bus
is modular and represents an excellent balance between simplicity and flexibility.
Many companies supply STD BUS products providing A/D, D/A, digital input and
digital output capabilities, memory, and support functions such as real time
clocks, graphics, EPROM programmers, etc. Microprocessors currently used in the
satellites are the 4 MHz Z80A and the INS8073. Such a system as used in the ASF
is shown in Figure 18.

The 8073 based system is a single board computer which has digital I/O, RAM,
EPROM, an EPROM programmer, industrial BASIC including logical operations,
utility routines, and a real time clock. With the exception of interface circuits
to accommodate non-TTL and increased drive levels, the electronics are complete
for many small testing program requirements.

The single board capabilities and STD BUS configuration also allow easy
incorporation of additional capabilities such as A/D, etc., through the uncommitted
STD BUS card cage approach.

This simple single board system, in conjunction with the intralab network,
addresses many of the limitations experienced with the evaluation kit and
minicomputer approaches. The test systems at the chambers no longer need to
store, manipulate, and analyze data or to support multiple programs and
programming tasks. The distributed intelligence and resource approach allows the
uP at the chamber to be solely responsible for performing testing, with only some
preliminary data analysis, while offloading the bulk resource and non-testing
requirements.

In the network the satellite processors are tied via a bidirectional link allowing
data transfer, program transfer, and intervention by the central machine when the
needs exceed the capabilities resident at the chamber. As an example, the central
controller downloads the test algorithm to the satellite test processor, which
performs the test until completion, an update from the central processor, or fault
detection. Upon fault detection, the satellite establishes the time and conditions
of the fault and performs some limited analysis. If the satellite can resolve the
fault sufficiently, testing will resume with the appropriate documentation and
changes to the test conditions having been made. If the satellite processor cannot
sufficiently analyze and resolve the fault, or requires other assistance, it signals
the central controller. The central controller can respond by storing the data,
analyzing the situation, and downloading additional programs specifically
targeted to help identify the location of the fault, including routines to verify
proper operation of the satellite. Upon completion of the analysis routines the
central controller will return the satellite to testing, perhaps with a new program
or the old program updated with error masking, etc.
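
The division of responsibility between satellite and central controller in this
example can be sketched as follows. This is a toy model in Python, not the BASIC
software run at RADC; the class names, the can_resolve predicate, the fault
dictionary fields, and the choice of device masking as the central controller's
response are all hypothetical.

```python
class CentralController:
    """Stand-in for the central machine: stores data, returns an update."""

    def __init__(self):
        self.stored = []

    def assist(self, satellite_id, fault):
        self.stored.append((satellite_id, fault))  # storage of the data
        # Assumed response: resume testing with the faulty device masked.
        return {"mask_devices": {fault["device"]}}


class Satellite:
    """Stand-in for the test processor at one stress chamber."""

    def __init__(self, ident, central, can_resolve):
        self.ident = ident
        self.central = central
        self.can_resolve = can_resolve  # hypothetical local-analysis check
        self.masked = set()
        self.journal = []

    def on_fault(self, fault):
        """Record the fault, try to resolve it locally, else escalate."""
        self.journal.append(fault)  # time and conditions of the fault
        if self.can_resolve(fault):
            return "resumed locally"
        update = self.central.assist(self.ident, fault)
        self.masked |= update["mask_devices"]
        return "resumed after central assistance"
```

The point of the structure is that the satellite never needs mass storage or
analysis software of its own; it only needs to know when to ask for help.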

Such a distributed intelligence system identifies levels of responsibility,
allowing specialization of the hardware; increases the use of resources such as
mass storage, which can be shared; and supports data analysis and reduction apart
from the testing process. The central controller, with a multi-user operating
system, is not involved full time with testing as are the satellites. It can support
the overhead management functions and service the multiple satellites.

The system as developed at RADC utilizes a 6800 based minicomputer with
64K RAM and dual floppy disks as the central processor. Common resources are
enhanced by a real time clock providing date and time of day. Also attached to the
central machine are a printer and a uP development system with dual floppy disks.
A microprocessor development system has been modified to support color graphics
display of data gathered from the networked satellites. The network is configured
around the RS232C standard, which is very common and easily utilized. The system
is illustrated in Figure 19. The central controller runs in BASIC, as do all the
satellite systems, to facilitate programming. The central controller operating
system has been modified to service the satellites while programming or data
analysis is being run from the keyboard.

In  practice  the  system  has  been  designed  and  assembled  in-house.  The  central 
computer  operating  system  modifications  and  graphics  capabilities  are  very  crude 
and  do  not  support  multiple  users.  The  entire  system  is  unique  and  hence  has  to  be 
uniquely  maintained.  But  while  the  capabilities  are  not  sophisticated  or  complete 
enough  to  support  the  exploitation  of  computerized  testing,  they  have  been 
adequate  to  indicate  some  of  the  tremendous  advances  that  can  be  made  by 
automating  the  stress  testing  process  and  have  shown  the  flexibility,  utility,  and 
cost  effectiveness  of  the  distributed  processing,  shared  resource,  networked 
approach. 

Just as the integration of computers with electrical testing made possible the
AMTE which are commonplace today, there is much productivity to be gained in
stress testing from the integration of computers. The future will bring more and
more integration of computers into the stress test. The whole of reliability
characterization will be greatly improved and extended through the intelligent
application of automation. In fact, reliability characterization of coming devices
will not be possible without the extensive use of computer technology.


IV.  Future-Present 


This  brings  us  to  the  future-present.  Increasingly  complex  devices  are 
increasingly  complex  to  characterize  yet  there  isn't  enough  time  or  money  in  the 
budget  to  acquire  the  necessary  information  with  the  standard  tools.  It  is  evident 
that  automation  of  the  stress  test  has  much  to  offer.  It  is  evident  that  automation 
of  the  entire  reliability  characterization  process  has  much  to  offer.  It  is  evident 
that  automation  is  essential.  After  all,  it  is  only  through  automation  that  these 
complex  devices  are  possible  in  the  first  place. 

Automated  stress  systems  are  commercially  available  today  that  provide 
control  of  the  temperature,  voltages,  clock  rates,  patterns,  and  sequencing  as  well 
as  monitoring  of  the  DUT  outputs  for  functionality.  These  systems  are  computer 
based  and  employ  distributed  intelligence  and  shared  resources  interconnected  by  a 
network.  RADC  is  planning  to  acquire  such  a  system. 

Commercially  available  systems  today  which  perform  monitored  testing  do  so 
based  on  monitoring  the  functionality  of  the  DUTs.  The  leakage  monitoring  as  in 
the  EOS/ESD  program  is  an  important  enhancement  of  automated  monitored 
testing  but  one  which  is  not  yet  commercially  available.  Failure  is  often  defined  in 
terms  of  stability.  Unstable  parts  are  indicative  of  poor  process  control  and  likely 
to  result  in  system  failure.  Particularly  for  CMOS,  leakage  currents  are  the 
primary  indicators  of  instability.  Good  devices  will  have  very  low  leakage. 
Essentially  any  degradation  will  show  up  as  an  increased  leakage.  Practical  CMOS 
failure  analysis  begins  with  examining  the  supply  and  input  leakage  measurements. 

The implementation of leakage current measurements involves incorporating
analog to digital converter capability into the computer based test bed. Multiple
channel, twelve bit, programmable gain, computer compatible A/D boards are
readily available, inexpensive, and easy to utilize. The distributed
microprocessor based architecture of the current systems will greatly facilitate
this task.

Combining  the  functional  and  parametric  monitoring  capabilities  with  the 
ability  to  control  the  power  supply  and  input  voltages,  temperature,  frequency, 
patterns,  etc.,  with  the  data  handling  capabilities  of  a  computer  will  make  possible 
more  extensive  characterization  in  the  chamber. 

Automation  of  the  stress  test  is,  however,  only  one  part  of  what  is  necessary 
to  evaluate  complex  devices.  It  is  important  to  know  what  failure  mechanisms  to 
expect,  where  to  expect  them,  how  to  stress  for  them,  and  how  to  recognize  their 
occurrence. 

One area that is seemingly not being addressed is that of tapping the design
database generated through the CAD/CAM/CAT work to support the stress test.
Effort has focused on making use of that database for test vector generation for
AMTE. Stress testing needs that database even more. The old standby, step stress,
attempts to find the limits where devices stop working due to vanishing margins.
These margins can be design margins or physical margins. Because of the
complexity and numbers of devices it will be necessary to be smart about how to
test. The stress test must address those areas of least margin and greatest
sensitivity to activate the failure mechanisms most effectively. The design
database should be tapped to predict those areas most likely to incur the failure
mechanisms and those stress scenarios most likely to stimulate them. These
models should address effects of voltages, temperature, currents and current
densities, noise, drive, etc. These margins are related to geometries, doping
profiles, relative positions, etc. Much of the model parameter information is in
the design database. In the future, computer modelling will become increasingly
important. Computer simulations will be performed to predict the effects of
failure mechanisms on circuit performance. From these simulations stress
scenarios will be constructed. This is not a trivial task. Modelling of the physics
of failure will be required. Effective computer models, simulations, and
predictions will be necessary. We must be smart in how we characterize our
devices.

Another  area  for  modelling  is  failure  analysis.  The  size  and  multilayer 
construction  of  devices  makes  physical  location  and  verification  of  failures 
impossible with today's methods. By using the model in the design database the
computer  can  generate  the  probable  failure  mechanisms  and  locations  which  would 
produce  the  observed  device  conditions.  We  must  be  smart  about  what  we  look  for 
and  where  we  look  for  the  problem. 

Recognizing failures also needs new techniques. Earlier detection can lead to
increased productivity. In the late sixties sophisticated data analysis software
entitled OLPARS (On-Line Pattern Analysis and Recognition System) was
developed at RADC. Key features allowed multidimensional data bases to be
visualized and manipulated, and decision boundaries drawn, enabling automatic
classification of subsequent data. It is proposed that this type of computer
analysis be done on the databases, particularly those generated by the SMART and
AMTE equipments. Multidimensional data analysis may lead to recognition of the
elusive "precursors of failure." Utilizing decision planes learned from previous
data sets in n dimensions would possibly yield earlier determination of failure or
instability. We must be smart in how we analyze the data.
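
The idea of learning a decision plane from previous data and applying it to
subsequent measurements can be illustrated with a very simple classifier. The
Python sketch below places a plane midway between the centroids of two labelled
groups of measurement vectors. It is not OLPARS and makes no claim about how
OLPARS draws its boundaries; the function names and the two class labels are
assumptions chosen for illustration.

```python
def centroid(rows):
    """Mean vector of a list of equal-length measurement vectors."""
    n = len(rows)
    return [sum(column) / n for column in zip(*rows)]


def learn_boundary(stable_rows, failing_rows):
    """Return (weights, bias) of the plane midway between class centroids."""
    c_stable = centroid(stable_rows)
    c_fail = centroid(failing_rows)
    weights = [f - s for f, s in zip(c_fail, c_stable)]
    midpoint = [(f + s) / 2 for f, s in zip(c_fail, c_stable)]
    bias = -sum(w * m for w, m in zip(weights, midpoint))
    return weights, bias


def classify(row, boundary):
    """Report which side of the learned plane a new measurement falls on."""
    weights, bias = boundary
    score = sum(w * x for w, x in zip(weights, row)) + bias
    return "failing side" if score > 0 else "stable side"
```

In practice the parameters would need to be scaled to comparable ranges before
such a plane is meaningful, since an unscaled nanoamp reading would otherwise
dominate a supply voltage reading.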

Multidimensional analysis could lead to smaller data bases. Each parameter
that is measured would be considered as a dimension in the multidimensional
database. Multidimensional analysis resulting in decision boundaries could result in
identifying those dimensions (parameters) which have minimal or no informational
content. If applied to data bases such as MIL-M-38510 slash sheet tests or
qualification data gathered by DESC, it is not unreasonable to expect that the
volume of parameters measured and the frequency of measurements could be
reduced, making the process quicker, more efficient, and more effective.
Automation makes available tremendous amounts of numbers. Automation must be
utilized to identify the information and extract it. Such analysis will make it
easier to recognize the patterns and to key on the trends that lead to failure. We
must be smart in the amount of data we handle in order to get the information we
need.
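
One simple way to flag parameters with little informational content is to score
each dimension by how well it separates two labelled groups of devices. The
Python sketch below uses a ratio of the squared difference in class means to the
summed class variances. This particular score, the function names, and the
threshold are assumptions chosen for illustration; the text does not specify a
method.

```python
from statistics import mean, pvariance


def separation_scores(stable_rows, failing_rows):
    """Score each parameter by class-mean gap squared over summed variances."""
    scores = []
    for stable_col, failing_col in zip(zip(*stable_rows), zip(*failing_rows)):
        gap = mean(failing_col) - mean(stable_col)
        spread = pvariance(stable_col) + pvariance(failing_col)
        if spread > 0:
            scores.append(gap * gap / spread)
        elif gap != 0:
            scores.append(float("inf"))  # perfectly separated, no scatter
        else:
            scores.append(0.0)  # constant and identical in both groups
    return scores


def low_information_parameters(scores, threshold):
    """Indices of parameters whose score falls below the threshold."""
    return [i for i, score in enumerate(scores) if score < threshold]
```

Parameters returned by low_information_parameters are candidates for less
frequent measurement or removal, subject to engineering review.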

Another area in which smart testing offers advantages is in JAN requirement
enforcement. The reduction of escapes through monitoring has already been
discussed. The area of test documentation can also be improved. Such principles
as restricted access, applied to smart testing, could reduce the problem recently
experienced with major military suppliers cheating on the screening requirements.
Documentation kept by the computer, identified by lot and serial numbers, with
dates, times, temperature and voltage profiles, escape information, failure
information, etc., could be readily maintained and delivered with each lot. This is
particularly important for optimized screens, where the details of the stress
conditions vary with each lot to achieve the target reliability. Computer programs
can be broken into and data and records adulterated, but computer security with
access restricted to only high level company people elevates the responsibility and
the liability for deliberate fraud.

Other  applications  will  surely  surface.  SMART  is  only  one  portion  of 
reliability  characterization.  Each  portion  needs  to  be  automated  and  integrated. 
The  automation  of  the  stress  test  is  truly  smart  testing.  The  Air  Force  needs  the 
capabilities  of  smart  testing.  The  time  is  now. 

V.  Conclusion

New  stress  testing  techniques  are  a  response  to  the  need  to  evaluate  and 
characterize  the  wide  variety  of  microelectronics.  It  is  the  microelectronics 
industry  that  drives  the  stress  test  industry  and  that  industry  is  plunging  into  small 
volume,  custom,  high  density,  high  complexity  devices.  The  new  technologies  will 
not  be  characterized  with  the  old  techniques.  In  order  to  assess  reliability 
problems  and  provide  the  needed  answers  smart  testing  capability  must  be 
established.  The  integration  of  computers  into  the  stress  testing  portion  of 
reliability  characterization  is  essential.  The  potential  advantages  are  great. 


